Methods and apparatuses for processing packets in a credit-based flow control scheme

ABSTRACT

Methods and systems for processing a second request before processing of a first request has completed. The first request is associated with a first flow control credit type, and the second request is associated with a second flow control credit type. After a period of time, the second request is selected for processing based on the first flow control credit type and the second flow control credit type.

BACKGROUND

1. Field of the Invention

The invention relates generally to credit-based flow control and more specifically relates to processing packets in a credit-based flow control scheme.

2. Discussion of Related Art

To manage the flow of data between a transmitter and a receiver, a flow control scheme is typically used to prevent the transmitter from transmitting additional data to the receiver when the receiver is not able to receive the additional data. In one flow control scheme, the receiver issues credits against which the transmitter may transmit data to the receiver. The transmitter may not transmit data to the receiver if doing so would consume more than the available credit. As the receiver becomes able to receive additional data, the receiver issues additional credits to the transmitter.
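The credit mechanism described above can be illustrated with a minimal sketch. The class name `CreditLink`, its methods, and the credit amounts are hypothetical and chosen only for illustration; they are not taken from any standard or from the embodiments described later.

```python
class CreditLink:
    """Transmitter-side view of the credits issued by a receiver (hypothetical)."""

    def __init__(self, initial_credits: int) -> None:
        self.available = initial_credits

    def try_send(self, cost: int) -> bool:
        """Transmit only if doing so would not consume more than the available credit."""
        if cost > self.available:
            return False
        self.available -= cost
        return True

    def grant(self, credits: int) -> None:
        """The receiver issues additional credits once it can accept more data."""
        self.available += credits


link = CreditLink(initial_credits=4)
sent_first = link.try_send(3)   # fits within the 4 available credits
sent_second = link.try_send(2)  # refused: only 1 credit remains
link.grant(2)                   # receiver becomes able to receive more data
sent_retry = link.try_send(2)   # now fits
```

In this sketch the transmitter simply refuses a transmission that would exceed the available credit and retries after the receiver issues more.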

One exemplary credit-based flow control scheme is used by the Peripheral Component Interconnect (“PCI”) Express standard. In this scheme, a data transmission is associated with a credit type for flow control purposes. There are six flow control credit types: posted request header, posted request data, non-posted request header, non-posted request data, completion header, and completion data. A receiving device initially advertises credits available for each flow control credit type. When a transmitting device transmits packets of a particular credit type to the receiving device, the transmitting device uses up credits of the particular credit type. After the receiving device finishes processing the packets, the receiving device signals the transmitting device to restore the available credit.

A PCI Express device can comprise multiple entities and their corresponding one or more functions. Each function (or entity) can initiate a request at the PCI Express Application Layer that results in the transmission of multiple packets (which also uses up credits of the corresponding credit type). Although multiple packets may be generated in a request, the PCI Express device begins to transmit packets even if there are not enough credits to complete the processing of the request.

The PCI Express device verifies that there is sufficient credit available before transmitting each packet. If sufficient credit is not available to transmit a packet, processing of the request is blocked. However, this causes processing of a second request to be stalled even if the second request would use a second credit type that is available, and the PCI Express device would thus be able to transmit packets generated from the second request. Stalling the processing of the second request leads to degraded performance of the PCI Express device just as maintaining high performance of the PCI Express device is becoming more critical in today's demanding data processing applications.

Thus it is an ongoing challenge to maintain high performance of the PCI Express device through improved processing of packets in a credit-based flow control scheme.

SUMMARY

The present invention solves the above and other problems, thereby advancing the state of the useful arts, by providing methods and apparatuses for processing a second request/packet descriptor before processing of a first request/packet descriptor has completed. It is noted that a request can generally be seen as comprising a “packet descriptor” that describes packets to be transmitted for the request. The first packet descriptor is associated with a first flow control credit type, and the second packet descriptor is associated with a second flow control credit type. After a period of time, the second packet descriptor is selected for processing based on the first flow control credit type and the second flow control credit type. Accordingly, processing of the second packet descriptor is no longer stalled, and performance is improved as a result.

In one aspect hereof, a method is provided for processing a second packet descriptor before processing of a first packet descriptor has completed. The first packet descriptor is associated with a first flow control credit type, and the second packet descriptor is associated with a second flow control credit type. The method comprises processing the first packet descriptor and stalling the processing of the second packet descriptor. The method also comprises selecting, after a period of time, the second packet descriptor for processing based on the first flow control credit type and the second flow control credit type. Additionally, the method comprises processing the second packet descriptor.

Another aspect hereof provides an apparatus for processing a second packet descriptor before processing of a first packet descriptor has completed. The first packet descriptor is associated with a first flow control credit type, and the second packet descriptor is associated with a second flow control credit type. The apparatus comprises a processing element for processing the first packet descriptor and the second packet descriptor. The apparatus also comprises a stalling element for stalling the processing of the second packet descriptor. Additionally, the apparatus comprises a selecting element for selecting, after a period of time, the second packet descriptor for processing based on the first flow control credit type and the second flow control credit type.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart describing an exemplary method in accordance with features and aspects hereof for processing a second packet descriptor before processing of a first packet descriptor has completed.

FIG. 2 is a flowchart describing exemplary additional details for selecting the second packet descriptor for processing in accordance with features and aspects hereof.

FIG. 3 is a flowchart describing alternative exemplary details for selecting the second packet descriptor for processing in accordance with features and aspects hereof.

FIG. 4 is a block diagram of an exemplary apparatus in accordance with features and aspects hereof for processing a second packet descriptor before processing of a first packet descriptor has completed.

FIG. 5 is a block diagram describing exemplary additional details of a queue processor in accordance with features and aspects hereof.

FIG. 6 is a block diagram describing exemplary additional details of a packet descriptor in accordance with features and aspects hereof.

FIG. 7 is a block diagram describing exemplary additional details of a queue in accordance with features and aspects hereof.

FIG. 8 is a block diagram describing alternative exemplary details of queues in accordance with features and aspects hereof.

FIG. 9 is a block diagram describing alternative exemplary details of queues in accordance with features and aspects hereof.

FIG. 10 is a diagram describing a deadlock condition.

FIG. 11 is a diagram describing the deadlock condition having been corrected in accordance with features and aspects hereof.

DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart describing an exemplary method in accordance with features and aspects hereof for processing a second packet descriptor (“PD”) before processing of a first PD has completed. The first PD is associated with a first flow control credit type (“FCCT”), and the second PD is associated with a second FCCT. At step 110, a processor begins processing the first PD. While the processor processes the first PD, the processor stalls processing of the second PD at step 120. At step 130, the processor selects, after a period of time, the stalled second PD for processing. The step of selecting may be based on the first FCCT and the second FCCT as will be described in greater detail.

At step 140, the processor then updates the first PD to reflect a status of the processing of the first PD. It will be understood that the processor may generate multiple packets based on a single PD. Accordingly, the processor may have finished transmitting/processing a number of packets when the second PD is selected for processing after the period of time. Step 140 thus ensures that the number of transmitted/processed packets is reflected in the first PD, allowing the processor to properly process the first PD at a later time. For example, a first PD may describe that 2000 bytes of remote data starting at remote address 3500 are to be read into a local buffer starting at local address 1000. Suppose that packets for reading the first 500 bytes of remote data have been transmitted when the processor selects the second PD for processing. The processor would then update the first PD to reflect that 500 bytes of remote data have been read, and/or that 1500 bytes of remote data starting at remote address 4000 remain to be read into the local buffer starting at local address 1500. At step 150, the processor begins processing the selected second PD.
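The update of step 140 can be sketched with the numbers from the example above. The `ReadDescriptor` class and its field names are hypothetical; the description does not prescribe a particular layout for the status information.

```python
from dataclasses import dataclass


@dataclass
class ReadDescriptor:
    """Hypothetical PD for a remote read; field names are assumptions."""
    remote_addr: int   # next remote address to read from
    local_addr: int    # next local buffer address to write to
    remaining: int     # bytes still to be read
    done: int = 0      # bytes already requested

    def record_progress(self, nbytes: int) -> None:
        """Reflect the status of processing so the PD can be resumed later."""
        if nbytes < 0 or nbytes > self.remaining:
            raise ValueError("progress must be between 0 and the remaining bytes")
        self.remote_addr += nbytes
        self.local_addr += nbytes
        self.remaining -= nbytes
        self.done += nbytes


# 2000 bytes at remote address 3500 into a local buffer at address 1000.
first_pd = ReadDescriptor(remote_addr=3500, local_addr=1000, remaining=2000)
# Packets for the first 500 bytes were transmitted before the switch.
first_pd.record_progress(500)
```

After the update, the descriptor holds 1500 remaining bytes starting at remote address 4000 and local address 1500, so processing can later resume from that point.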

It is noted that in the example above, after reading the first 500 bytes of remote data, processing of the first PD may have been blocked for lack of credits of the first FCCT. However, credits of the second FCCT may be available, which would allow the processor to transmit packets generated from the second PD, whose processing has been stalled while the processor processes the first PD. Advantageously, selecting the second PD for processing after the period of time improves performance, as processing of the second PD would otherwise remain stalled. The processor or another entity may set a timer that is triggered after the period of time. Alternatively, the processor may time out after being blocked for the period of time.

FIG. 2 is a flowchart describing exemplary additional details for selecting the second PD in step 130 of FIG. 1 for processing in accordance with features and aspects hereof.

At step 210, the processor chooses the second PD from a number of PDs. The PDs may be arranged in a variety of ways, some of which will be described in greater detail. For example, the PDs may be arranged in queues, each holding PDs associated with a particular function. It will be understood that packets of a “function” may be more generally defined as a group of packets whose sequence of processing should not be reordered. In this arrangement, a first queue may comprise the first PD and a second queue may comprise the second PD. At step 210, the processor may choose the second PD from PDs of the second queue so that the first and the second PDs are associated with different functions. At step 220, the processor checks that the first FCCT associated with the first PD is different from the second FCCT associated with the second PD. If the two PDs are associated with the same FCCT, the processor repeats the steps from 210 and chooses another PD from the second queue and/or from a third queue.
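Steps 210 and 220 can be sketched as a loop over per-function queues. The `PD` tuple, the `select_second_pd` function, and the sample contents are hypothetical. The scan order (every PD of each other queue, in turn) is one assumed reading of "another PD from the second queue and/or from a third queue".

```python
from collections import deque
from typing import Deque, Dict, NamedTuple, Optional


class PD(NamedTuple):
    name: str
    function: int
    fcct: str


def select_second_pd(first: PD, queues: Dict[int, Deque[PD]]) -> Optional[PD]:
    """Choose a PD from another function's queue (step 210) whose FCCT
    differs from that of the first PD (step 220)."""
    for function, queue in queues.items():
        if function == first.function:
            continue  # same function: belongs to the first PD's own queue
        for candidate in queue:
            if candidate.fcct != first.fcct:
                return candidate
            # Same FCCT: repeat from step 210 with another PD.
    return None


first = PD("pd_a", function=0, fcct="non-posted request header")
queues = {
    0: deque([first]),
    1: deque([
        PD("pd_b", function=1, fcct="non-posted request header"),
        PD("pd_c", function=1, fcct="posted request header"),
    ]),
}
selected = select_second_pd(first, queues)
```

Here `pd_b` is rejected because it would need the same credits that blocked the first PD, and `pd_c` is selected instead.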

FIG. 3 is a flowchart describing alternative exemplary details for selecting the second PD in step 130 of FIG. 1 for processing in accordance with features and aspects hereof. At step 310, the processor likewise chooses the second PD from a number of PDs. The PDs may alternatively be arranged in queues, each holding PDs associated with a particular FCCT. In this alternative arrangement, a first queue may comprise the first PD and a second queue may comprise the second PD. At step 310, the processor may choose the second PD from PDs of the second queue so that the first and the second PDs are associated with different FCCTs. At step 320, the processor checks that the first function associated with the first PD is different from the second function associated with the second PD. If the two PDs are associated with the same function, the processor repeats the steps from 310 and chooses another PD from the second queue and/or from a third queue.

Those of ordinary skill in the art will readily recognize numerous additional and equivalent steps that may be performed and/or omitted in the methods of FIGS. 1 through 3. Such additional and equivalent steps are omitted herein merely for brevity and simplicity of this discussion.

FIG. 4 is a block diagram of an exemplary apparatus in accordance with features and aspects hereof for processing a second PD before processing of a first PD has completed. The apparatus comprises one or more queue(s) 410 that comprise PDs. A queue processor 420 processes PDs of the queue(s) 410 and generates packets for transmission from a PD. However, prior to transmitting a packet, the queue processor 420 sends a request to a credit manager 430 to verify that sufficient credits of an FCCT associated with the PD are available. Meanwhile, the queue processor 420 sets a timer to be triggered after a period of time. Alternatively, the timer may be set by the credit manager 430 upon receipt of the request. If sufficient credit is available, the queue processor 420 forwards the packet to a packet transmitter 450 for transmission to a receiver over an interconnect 460. The queue processor 420 also communicates with the credit manager 430 so that the available credit for the FCCT is decreased. Additionally, the queue processor 420 or the credit manager 430 cancels the timer.

If sufficient credit is not available, processing of the PD becomes blocked, causing processing of other PDs to be stalled. A packet receiver 440 may receive packets over the interconnect 460 when the receiver signals that a number of credits of an FCCT may be restored. The packet receiver 440 passes this information to the credit manager 430 for processing, which may in turn unblock the processing of a PD that has been blocked for lack of credit of the FCCT. Alternatively, the timer may be triggered after the period of time and the queue processor 420 would select a second PD for processing. As another alternative, the queue processor 420 may time out after being blocked for the period of time, and would then also select the second PD for processing.
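The interaction between the credit check, the timer, and the selection of a second PD can be sketched as a small tick-driven simulation. Everything here is an assumption made for illustration: the `PD` class, the `run` function, one credit per packet, and a timer modeled as a count of blocked ticks rather than a hardware timer. Credit restoration through the packet receiver 440 is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PD:
    name: str
    fcct: str
    packets_left: int


def run(pds: List[PD], credits: Dict[str, int],
        timeout_ticks: int, max_ticks: int) -> List[Tuple[int, str]]:
    """Return a log of (tick, PD name) for every packet transmitted."""
    log: List[Tuple[int, str]] = []
    current = 0       # index of the PD being processed
    blocked_for = 0   # ticks since the timer was set
    for tick in range(max_ticks):
        pd = pds[current]
        if pd.packets_left > 0 and credits.get(pd.fcct, 0) > 0:
            credits[pd.fcct] -= 1      # available credit is decreased
            pd.packets_left -= 1
            log.append((tick, pd.name))
            blocked_for = 0            # the timer is cancelled
            continue
        # Blocked (a finished PD is treated like a blocked one for brevity).
        blocked_for += 1
        if blocked_for >= timeout_ticks:
            # Timer triggered: select a PD with a different FCCT.
            for i, other in enumerate(pds):
                if i != current and other.fcct != pd.fcct and other.packets_left > 0:
                    current = i
                    blocked_for = 0
                    break
    return log


first = PD("first", "non-posted request header", packets_left=3)
second = PD("second", "posted request header", packets_left=2)
credits = {"non-posted request header": 1, "posted request header": 2}
log = run([first, second], credits, timeout_ticks=2, max_ticks=8)
```

In this run the first PD transmits one packet and then blocks for lack of credit. After two blocked ticks the second PD is selected and transmits both of its packets, while the first PD keeps its two untransmitted packets for later.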

It will be understood that the apparatus may be a PCI Express device, and each of the elements may comprise circuitry, memory, processor, and/or instructions to perform the functions as described. The interconnect 460 may be a PCI Express interconnect for communication with a PCI Express switch.

FIG. 5 is a block diagram describing exemplary additional details of the queue processor 420 in FIG. 4 in accordance with features and aspects hereof. The queue processor 420 comprises a processing element 510 for processing the first PD and the second PD. The queue processor 420 also comprises a stalling element 520 for stalling the processing of the second PD, and an updating element 530 for updating the first PD to reflect a status of the processing of the first PD. Additionally, the queue processor 420 comprises a selecting element 540 for selecting, after a period of time, the second PD for processing based on the first flow control credit type and the second flow control credit type. The queue processor 420 also comprises a timer 550 that may be set to be triggered and/or timed out after a period of time. Among its design choices, the timer 550 may comprise counters that count between a small value and an “infinite” value (i.e., the timer does not time out).

FIG. 6 is a block diagram describing exemplary additional details of a PD 610 in accordance with features and aspects hereof. The PD 610 is associated with a function 620 and an FCCT 630. For example, the PD 610 may comprise a record or a data structure that comprises data fields for the function 620 and the FCCT 630 shown in FIG. 6. The function 620 may be any function performed by an apparatus including a PCI Express device. It will be understood that packets of a “function” may be more generally defined as a group of packets whose sequence of processing should not be reordered. In a PCI Express device, the FCCT 630 may be any one of the following types: posted request header, posted request data, non-posted request header, non-posted request data, completion header, and completion data. If credits of a particular FCCT are not available, a transmitter may not transmit packets of the particular FCCT.
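A record of the kind described for the PD 610 might look like the following sketch. The names `FCCT`, `PacketDescriptor`, and `may_transmit`, and the use of an integer to identify a function, are assumptions; only the two data fields and the six credit types come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class FCCT(Enum):
    """The six flow control credit types listed above."""
    POSTED_HEADER = "posted request header"
    POSTED_DATA = "posted request data"
    NON_POSTED_HEADER = "non-posted request header"
    NON_POSTED_DATA = "non-posted request data"
    COMPLETION_HEADER = "completion header"
    COMPLETION_DATA = "completion data"


@dataclass(frozen=True)
class PacketDescriptor:
    function: int  # identifies the function 620 (assumed to be an integer id)
    fcct: FCCT     # the FCCT 630


def may_transmit(pd: PacketDescriptor, credits: Dict[FCCT, int]) -> bool:
    """Packets of an FCCT may not be transmitted when no credits of it remain."""
    return credits.get(pd.fcct, 0) > 0


credits = {FCCT.POSTED_HEADER: 2, FCCT.NON_POSTED_HEADER: 0}
write_pd = PacketDescriptor(function=0, fcct=FCCT.POSTED_HEADER)
read_pd = PacketDescriptor(function=1, fcct=FCCT.NON_POSTED_HEADER)
```

With these sample credit counts, `may_transmit` allows the posted request and refuses the non-posted one.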

FIG. 7 is a block diagram describing exemplary additional details of the queue(s) 410 in FIG. 4 in accordance with features and aspects hereof. The queue(s) 410 comprises a unified queue 710. The unified queue 710 comprises a first PD 711, a second PD 712, and a third PD 713. After the processor chooses the second PD 712, the processor would check that the second PD 712 is associated with a different function and a different FCCT than the first PD 711. If either the function or the FCCT is the same, the processor would choose the third PD 713 and perform the same checking between the first PD 711 and the third PD 713.
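With a unified queue, both conditions are checked on each candidate. The sketch below uses hypothetical names (`PD`, `select_from_unified`), and the functions and FCCTs assigned to PDs 711 through 713 are sample values chosen for illustration, not values given in FIG. 7.

```python
from typing import List, NamedTuple, Optional


class PD(NamedTuple):
    label: str
    function: int
    fcct: str


def select_from_unified(queue: List[PD], first: PD) -> Optional[PD]:
    """Return the first later PD that differs from `first` in both
    function and FCCT, or None if no such PD is queued."""
    for candidate in queue:
        if candidate is first:
            continue
        if candidate.function != first.function and candidate.fcct != first.fcct:
            return candidate
    return None


pd_711 = PD("711", function=0, fcct="non-posted request header")
pd_712 = PD("712", function=1, fcct="non-posted request header")  # same FCCT
pd_713 = PD("713", function=2, fcct="completion header")
unified_queue = [pd_711, pd_712, pd_713]
selected = select_from_unified(unified_queue, pd_711)
```

Under these sample values, PD 712 fails the FCCT check, so the same check is repeated on PD 713, which passes.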

FIG. 8 is a block diagram describing alternative exemplary details of the queue(s) 410 in FIG. 4 in accordance with features and aspects hereof. The queue(s) 410 comprises a first queue 810 and a second queue 820. The first queue 810 comprises PDs associated with a first function and the second queue 820 comprises PDs associated with a second function. The first queue 810 comprises the first PD 811, and the second queue 820 comprises the second PD 812 and the third PD 813. After the processor chooses the second PD 812 from the second queue 820, the processor would check that the second PD 812 is associated with a different FCCT than the first PD 811. Otherwise, the processor would choose the third PD 813 (or yet another PD from another queue) and perform the same checking between the first PD 811 and the third PD 813 (or yet another PD).

FIG. 9 is a block diagram describing alternative exemplary details of the queue(s) 410 in FIG. 4 in accordance with features and aspects hereof. The queue(s) 410 comprises a first queue 910 and a second queue 920. The first queue 910 comprises PDs associated with a first FCCT and the second queue 920 comprises PDs associated with a second FCCT. The first queue 910 comprises the first PD 911, and the second queue 920 comprises the second PD 912 and the third PD 913. After the processor chooses the second PD 912 from the second queue 920, the processor would check that the second PD 912 is associated with a different function than the first PD 911. Otherwise, the processor would choose the third PD 913 (or yet another PD from another queue) and perform the same checking between the first PD 911 and the third PD 913 (or yet another PD).

Those of ordinary skill in the art will readily recognize numerous additional and equivalent components and modules within a fully functional apparatus. Such additional and equivalent components are omitted herein for simplicity and brevity of this discussion. Thus, the structures of FIGS. 4 through 9 are intended merely as representatives of exemplary embodiments of features and aspects hereof.

FIG. 10 is a diagram describing a deadlock condition. In the figure, a dashed arrow line shows a dependency in the direction of the arrow. For example, if a first block has a dashed arrow line pointing to a second block, the first block depends on the second block and processing of the first block is queued/stalled after the second block. FIG. 10 describes interactions between an apparatus (e.g., a PCI Express device) and a root complex (e.g., a PCI Express end point for interfacing with a central processor). At the apparatus, block 1010 describes transmitting a second read packet for reading from the root complex. Processing of block 1010 cannot complete until receiving, from the root complex, a completion packet in response to a first read packet previously transmitted from the apparatus to the root complex. At the root complex, block 1020 describes transmitting a write packet for writing to the apparatus. Processing of block 1020 cannot complete until receiving, from the apparatus, a completion packet in response to a read packet previously transmitted from the root complex to the apparatus.

Meanwhile, at the root complex, block 1030 describes transmitting the completion packet in response to the first read packet previously transmitted from the apparatus. However, processing of block 1030 is queued and stalled after transmitting the write packet (block 1020). At the apparatus, block 1040 describes transmitting the completion packet in response to the read packet previously transmitted from the root complex. However, processing of block 1040 is queued and stalled after transmitting the second read packet (block 1010). As a result, a deadlock condition is formed, as neither the apparatus nor the root complex can transmit a packet because each is blocked waiting to receive a packet from the other.

FIG. 11 is a diagram describing the deadlock condition having been corrected in accordance with features and aspects hereof. FIG. 11 is similar to FIG. 10 except that block 1140 is no longer stalled after being selected for processing after a period of time in accordance with features and aspects hereof. Accordingly, block 1020 may be processed after block 1140, block 1030 may be processed after block 1020, and block 1010 may be processed after block 1030, so there is no longer a deadlock condition. It is noted that at block 1010, the reason that processing cannot complete may be that there is insufficient credit of an FCCT. On the other hand, the reason that processing cannot complete may instead be due to insufficient internal resources (e.g., insufficient buffer or memory space). Accordingly, it will be understood that features and aspects hereof may be applicable not just to improve processing of packets and packet descriptors in a credit-based flow control scheme, but also to avoid other problems, including potential deadlock conditions such as that described above.
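The dependencies of FIGS. 10 and 11 can be checked mechanically by treating each block as a node that waits on other blocks. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The dictionaries are a transcription of the dependencies described above, and `processing_order` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional, Set

# Each block maps to the set of blocks it must wait for.
fig10: Dict[str, Set[str]] = {
    "1010": {"1030"},  # second read waits for the root complex's completion
    "1030": {"1020"},  # that completion is queued behind the write
    "1020": {"1040"},  # the write waits for the apparatus's completion
    "1040": {"1010"},  # that completion is queued behind the second read
}

# FIG. 11: block 1140 is selected after a period of time and no longer
# waits behind block 1010.
fig11: Dict[str, Set[str]] = {
    "1010": {"1030"},
    "1030": {"1020"},
    "1020": {"1140"},
    "1140": set(),
}


def processing_order(deps: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return an order in which all blocks can complete, or None on deadlock."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


deadlocked = processing_order(fig10)
resolved = processing_order(fig11)
```

The FIG. 10 dependencies form a cycle, so no processing order exists. Removing the single dependency of the completion block on block 1010 leaves a chain that completes in the order 1140, 1020, 1030, 1010.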

While the invention has been illustrated and described in the drawings and foregoing description, such illustration and description are to be considered as exemplary and not restrictive in character. One embodiment of the invention and minor variants thereof have been shown and described. Protection is desired for all changes and modifications that come within the spirit of the invention. Those skilled in the art will appreciate variations of the above-described embodiments that fall within the scope of the invention. As a result, the invention is not limited to the specific examples and illustrations discussed above, but only by the following claims and their equivalents.

1. A method for processing a second packet descriptor before processing of a first packet descriptor has completed, wherein the first packet descriptor is associated with a first flow control credit type, and wherein the second packet descriptor is associated with a second flow control credit type, the method comprising: processing the first packet descriptor; stalling the processing of the second packet descriptor; selecting, after a period of time, the second packet descriptor for processing based on the first flow control credit type and the second flow control credit type; and processing the second packet descriptor.
2. The method of claim 1, further comprising: updating the first packet descriptor to reflect a status of the processing of the first packet descriptor.
3. The method of claim 1, wherein processing of the first packet descriptor is not completed in part for lack of credits of the first flow control credit type.
4. The method of claim 1, wherein processing the first packet descriptor comprises setting a timer that is triggered after the period of time.
5. The method of claim 1, wherein the step of selecting comprises checking that the first flow control credit type is different from the second flow control credit type.
6. The method of claim 1, wherein the first packet descriptor is associated with a first function and the second packet descriptor is associated with a second function, and wherein the first function is different from the second function.
7. The method of claim 1, wherein the first packet descriptor describes at least one Peripheral Component Interconnect (“PCI”) Express packet that is to be generated.
8. The method of claim 1, wherein the first flow control credit type comprises one of posted request header, posted request data, non-posted request header, non-posted request data, completion header, and completion data.
9. The method of claim 1, wherein: a first queue comprises the first packet descriptor and a second queue comprises the second packet descriptor; the first queue comprises packet descriptors associated with a first function; the second queue comprises packet descriptors associated with a second function; and the step of selecting comprises: choosing the second packet descriptor from the second queue; and checking that the first flow control credit type is different from the second flow control credit type.
10. The method of claim 1, wherein: the first packet descriptor is associated with a first function and the second packet descriptor is associated with a second function; a first queue comprises the first packet descriptor and a second queue comprises the second packet descriptor; the first queue comprises packet descriptors associated with a first flow control credit type; the second queue comprises packet descriptors associated with a second flow control credit type; and the step of selecting comprises: choosing the second packet descriptor from the second queue; and checking that the first function is different from the second function.
11. An apparatus for processing a second packet descriptor before processing of a first packet descriptor has completed, wherein the first packet descriptor is associated with a first flow control credit type, and wherein the second packet descriptor is associated with a second flow control credit type, the apparatus comprising: a processing element for processing the first packet descriptor and the second packet descriptor; a stalling element for stalling the processing of the second packet descriptor; and a selecting element for selecting, after a period of time, the second packet descriptor for processing based on the first flow control credit type and the second flow control credit type.
12. The apparatus of claim 11, further comprising: an updating element for updating the first packet descriptor to reflect a status of the processing of the first packet descriptor.
13. The apparatus of claim 11, wherein processing of the first packet descriptor is not completed in part for lack of the first flow control credit type.
14. The apparatus of claim 11, wherein the processing element comprises a setting element for setting a timer that is triggered after the period of time.
15. The apparatus of claim 11, wherein the selecting element comprises a checking element for checking that the first flow control credit type is different from the second flow control credit type.
16. The apparatus of claim 11, wherein the first packet descriptor is associated with a first function and the second packet descriptor is associated with a second function, and wherein the first function is different from the second function.
17. The apparatus of claim 11, wherein the first packet descriptor describes a Peripheral Component Interconnect (“PCI”) Express packet.
18. The apparatus of claim 11, wherein the first flow control credit type comprises one of posted request header, posted request data, non-posted request header, non-posted request data, completion header, and completion data.
19. The apparatus of claim 11, further comprising a first queue comprising the first packet descriptor and a second queue comprising the second packet descriptor, wherein: the first queue comprises packet descriptors associated with a first function; the second queue comprises packet descriptors associated with a second function; and the selecting element comprises: a choosing element for choosing the second packet descriptor from the second queue; and a checking element for checking that the first flow control credit type is different from the second flow control credit type.
20. The apparatus of claim 11, wherein the first packet descriptor is associated with a first function and the second packet descriptor is associated with a second function, the apparatus further comprising a first queue comprising the first packet descriptor and a second queue comprising the second packet descriptor, wherein: the first queue comprises packet descriptors associated with a first flow control credit type; the second queue comprises packet descriptors associated with a second flow control credit type; and the selecting element comprises: a choosing element for choosing the second packet descriptor from the second queue; and a checking element for checking that the first function is different from the second function.